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ABSTRACT 

A  new  on-line  method  is  presented  for  estimation  of  the  angular  random  walk 
and  rate  random  walk  coefficients  of  IMU  (Inertial  Measurement  Unit)  gyros 
and  accelerometers.  The  on-line  method  proposes  a  state  space  model  and 
proposes  parameter  estimators  for  quantities  previously  measured  from  off¬ 
line  data  techniques  such  as  the  Allan  variance  graph.  Allan  variance  graphs 
have  large  off-line  computational  effort  and  data  storage  requirements.  The 
technique  proposed  here  requires  no  data  storage  and  computational  effort  of 
0(100)  calculations  per  data  sample. 
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On-line  Estimation  of  Allan  Variance  Parameters 


EXECUTIVE  SUMMARY 


Inertial  navigation  is  an  important  technology  in  which  measurements  provided  by  gyro¬ 
scopes  and  accelerometers  are  used  to  determine  position  for  modern  aircraft.  Because  of 
the  importance  of  these  devices  in  modern  navigation,  a  full  understanding  of  their  noise 
characteristics  is  important. 

The  most  commonly  used  tool  for  analyzing  the  noise  characteristics  of  gyros  and  ac¬ 
celerometers  is  the  Allan  variance  graph  method  which  suffers  from  several  deficiencies. 
The  Allan  variance  graph  method  requires  the  data  to  be  collected  off-line  before  process¬ 
ing  can  be  performed.  It  also  requires  the  user  to  read  slopes  and  intercepts  manually  off 
the  Allan  variance  graph  so  that  noise  characteristics  can  be  calculated. 

The  key  contribution  of  this  paper  is  a  new  technique  that  can  analyze  data  as  it  is  received 
and  calculates  on-line  the  contribution  from  various  noise  sources.  The  proposed  method, 
unlike  the  Allan  variance  graph  method,  does  not  require  large  amounts  of  data  to  be 
stored.  The  proposed  method  also  has  the  additional  advantage,  if  the  data  set  is  large, 
of  requiring  less  computational  effort  overall. 

Convergence  results  are  presented  for  the  proposed  technique  based  on  the  law  of  large 
numbers  and  an  ordinary  differential  equation  approach.  The  technique  is  verified  on 
both  simulated  data  and  data  obtained  from  a  HG1700-AE11  Honeywell  IMU  (Inertial 
Measurement  Unit)  by  comparison  with  the  results  from  an  Allan  variance  graph  analysis. 

The  new  methodology  represents  a  fundamental  improvement  in  the  testing  procedures  for 
inertial  navigation  systems  and  will  enable  a  more  efficient  response  to  ADF  requirements 
for  IMU  modelling  in  support  of  weapon  system  performance  assessments. 
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1  Introduction 

Inertial  navigation  is  the  process  by  which  measurements  made  using  gyroscopes  and 
accelerometers  are  used  to  determine  position]!,  2].  Inertial  navigation  is  an  essential 
technology  for  modern  aircraft  and  an  important  tool  for  navigation  in  general.  Ac¬ 
celerometer  and  gyro  IMUs  (Inertial  Measurement  units)  suffer  from  particular  types  of 
noise  interferences  which  induce  IMU  navigation  errors  which  grow  with  time.  Because 
of  the  importance  of  systems  with  these  characteristics,  a  vast  amount  of  analysis  of  this 
type  of  noise  has  been  performed,  including:  frequency  fluctuations  in  atomic  clocks[3], 
noise  from  ring  laser  gyros[4]  and  gyro  sensors  in  general[5,  6]. 

Characterizing  noise  produced  by  IMU  sensors  by  a  single  RMS  number  was  overly  con¬ 
servative  in  the  short  term  and  did  not  adequately  model  the  longer  term  error  growth. 
The  Allan  variance  method  was  developed  to  better  characterize  the  noise  model  and  is 
now  the  standard  method  of  analysis[6j.  The  Allan  variance,  A(N),  of  a  sensor  output  is 
the  variance  of  the  means  of  successive  subsets  of  the  data  of  size  N. 

When  the  Allan  variance,  A(N),  is  plotted  against  the  sample  size,  N,  particular  noise 
features  in  the  gyro  output  appear  as  structures  in  the  Allan  variance  curve,  as  illustrated 
in  [4,  6,  7]  and  Figure  5  in  this  paper. 

There  are  several  disadvantages  of  the  Allan  variance  technique.  Obviously,  by  its  very 
nature,  it  is  an  off-line  process  and  requires  a  large  amount  of  data  to  be  stored.  Addi¬ 
tionally,  the  technique  requires  that  lines  of  best  fit  be  manually  placed  on  the  graph  and 
intercepts  to  be  read  off  to  obtain  estimates  of  the  noise  contributions.  Unfortunately, 
the  Allan  variance  technique  does  not  characterize  the  noise  well  when  contributions  from 
various  sources  overlap  on  the  Allan  variance  graph. 

This  paper  works  in  the  stochastic  model  framework  proposed  in  [5].  From  this  stochastic 
model,  we  develop  a  state  space  model  that  models  only  the  noise  contributions  from  the 
angular  random  walk  (ARW)  and  rate  random  walk  (RRW). 

The  key  contribution  of  this  paper  is  the  proposal  of  on-line  parameter  estimation  al¬ 
gorithms  for  the  ARW  and  RRW  constants.  Global  and  local  convergence  results  are 
established  in  several  stages  using  the  law  of  large  numbers  and  an  ordinary  differential 
equation  approach.  Unlike  the  Allan  variance  technique,  the  proposed  parameter  estima¬ 
tors  do  not  require  large  amounts  of  data  to  be  stored  and  only  require  approximately  100 
calculations  per  time  step.  For  larger  data  sets,  greater  than  10000,  the  proposed  on-line 
technique  is  computationally  superior  to  the  Allan  variance  technique  (a  typical  data  set 
might  contain  24hrs  of  1Hz  measurements,  ie.  ~  107  data  points) 

Simulation  studies  are  presented  both  for  computer  generated  data  and  for  data  from  a 
Honeywell  IMU.  Results  from  the  on-line  estimation  algorithm  are  compared  to  estimates 
from  the  Allan  variance  technique. 

This  paper  is  organized  as  follows:  In  Section  2,  we  introduce  a  state  space  model  for 
angular  and  rate  random  walks.  In  Section  3,  parameter  estimators  are  introduced  and  a 
partial  convergence  result  is  established  in  the  situation,  admittedly  artificial,  when  the 
error  model  state  variable  is  measurable.  In  Section  4,  we  introduce  a  conditional  mean 
estimate  and  an  estimation  algorithm  for  the  more  realistic  situation  where  only  output 
measurements  are  available.  Local  convergence  results  are  established  using  an  ordinary 
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differential  equation  approach.  In  Section  5,  implementation  issues  are  discussed  including 
initialization,  computational  effort  and  techniques  for  improving  convergence  rates  from 
poor  initializations.  Simulation  studies  are  presented  in  Section  6.  Finally,  in  Section  7 
some  conclusions  are  presented. 


2  Stochastic  Model 


Gyro  sensor  error  dynamics  can  be  modelled  by  the  discrete-time  stochastic  model  pre¬ 
sented  in  Figure  1,  see  also  [7,  4].  In  this  model  there  are  contributions  from  angular 
random  walk  noise  {Wo),  rate  random  walk  noise  (Wjg),  flicker  noise  (F)  and  ramp  noise 
(R).  Not  shown  on  the  figure  are  contributions  from  quantization  noise,  Markov  noise,  or 
sinusoidal  errors. 


Angular 
random 
waik  w 
VVD 


Flicker  Ramp 

noise  noise 


Figure  1  (U):  Stochastic  Model 


The  terms  “angular  random  walk”  and  “rate  random  walk”  are  slightly  misleading  because 
angle  rate  AS  is  often  measured  directly  and  these  terms  would  normally  be  seen  from 
a  linear  system’s  viewpoint  as  measurement  noise  and  process  noise  (which  produces  a 
classical  random  walk  in  the  output). 

In  this  paper,  we  propose  a  simplified  stochastic  model  in  which  the  only  non-zero  con¬ 
tributions  are  the  angular  random  walk  and  rate  random  walk  terms.  This  simplification 
seems  appropriate  for  some  of  the  sensors  that  have  been  examined  via  the  alternative 
Allan  variance  graph  technique.  Consider  Figure  5  in  this  paper  which  shows  the  Allan 
Variance  plot  for  the  output  of  the  gyro  examined  in  this  paper.  The  dominate  noise 
sources  appear  to  be  an  angular  random  walk  component  (the  —  \  slope  part  of  the  curve) 
and  a  rate  random  walk  component  (the  +i  slope  curve).  A  discussion  of  Allan  variance 
plots  can  be  found  in  [4,  6,  7]. 

The  simplified  stochastic  model  shown  in  Figure  2  can  be  represented  as  a  state  space 
model.  The  unknown  angular  random  walk  and  rate  random  walk  coefficients  are  param¬ 
eters  of  this  representation.  Estimation  of  these  quantities  can  then  be  examined  from  a 
linear  system’s  viewpoint[8,  9,  10].  In  the  following  section  we  define  our  mathematical 
framework  for  the  parameter  estimation  problem. 
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Figure  2  (U):  Simplified  stochastic  model 


2.1  State  Space  Model 

Consider  a  probability  measure  space  (Cl,P,P).  Here  Cl  is  an  arbitrary  space  or  set 
of  points  w,  T  is  a-algebra  in  Cl  (a  class  of  subsets  containing  Cl  and  closed  under  the 
formation  of  complements  and  countable  unions)  and  P  is  a  probability  measure  on  T. 
See  [11,  Pages  17-23]  for  more  details.  In  some  sense,  each  point  u  corresponds  to  a  triple 
({u^},  {ve}i  xo)>  where  wk,  vk  and  xq  are  defined  below.  Suppose  {a^},  l  G  Z+  (Z+  is 
the  set  of  positive  integers)  is  a  discrete-time  linear  stochastic  process,  taking  values  in  R, 
with  dynamics  given  by 

xk+i  =  x k  +  vB  wk+ 1,  x0  G  R  (2.1) 

Here  k  G  Z+ ,  \fB  is  the  rate  random  walk  coefficient  and  {u;/},  i  G  Z+,  is  a  sequence 
of  independent  and  identically  distributed  N( 0, 1)  scalar  random  variables.  The  notation 
N( 0, 1)  is  short  hand  for  a  random  variable  whose  density  is  Gaussian  with  zero  mean  and 
variance  1. 

The  state  process  x (,  t  G  Z+,  is  observed  indirectly  via  the  scalar  observation  process 
{ye}:  l  €  £+,  given  by 

Vk  -  xk  +  VD  vkeR  (2.2) 

Here  k  G  Z+,  \fT)  is  the  angular  random  walk  coefficient  and  {u^},  t  G  Z+,  is  a  sequence 
of  independent  and  identically  distributed  iV(0, 1)  scalar  random  variables.  We  assume 
that  xo,  {wi}  and  {v^}  are  mutually  independent.  The  sequence  {yk}  is  the  sequence 
of  angle  rates  measured  by  the  IMU  (or  Ad  in  Figure  2).  The  variance  of  noise  Wb  is 
B  and  the  variance  of  noise  Wo  is  D.  We  have  considered  here  only  single  axis  devices 
but  in  principle  this  technique  can  be  extended  to  3  dimensions  with  B  and  D  becoming 
covariance  matrices. 

Let  Qk  and  34  denote  the  complete  filiations  generated  by  {x(}  and  {yi}  respectively. 
For  example,  Qk  is  the  cr-algebra  generated  by  xq,x\,  . . .  ,xk,  denoted  a{xo,xi, . . . ,  rrfc}, 
augmented  by  including  all  subsets  of  events  of  probability  zero.  See  [12]  for  more  details 
about  filiations. 

The  model  described  by  (2.1), (2.2)  is  denoted  by 


A  =  \(B,  D,xq) 


(2.3) 


In  this  paper  we  will  present  convergence  results  that  hold  almost  surely.  We  will  say 
that  in  a  probability  space  (Cl,T,P)  a  result  holds  almost  surely  (a.s.)  if  it  holds  with 
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probability  1 ,  or  equivalently  that  it  holds  for  all  to  in  an  .F-set  of  probability  1  where 
w  £  SI.  We  are  concerned  with  convergence  so  consider  the  following  illustration.  Let 
z,zi,Z2,...  be  random  variables.  If 

lim  z/c(u> )  =  z(u>)  (2.4) 

k—>  oo 

for  all  to  in  an  F-set  of  probability  1,  then 

lim  z/c  =  z  almost  surely  or  a.s.  (2.5) 

k-+oo 

A  weaker  convergence  condition  is  that 

lim  P[\zk(to)  —  2(u>)|  >  e]  =  0  (2.6) 

k— >oo 

which  is  called  convergence  in  probability  and  is  equivalent  to  mean  square  convergence. 
Note  that  almost  sure  convergence  implies  convergence  in  probability  but  convergence  in 
probability  does  not  imply  almost  sure  convergence  . 


3  Parameter  Estimation  -  Full  Observations 


In  this  section  we  assume  that  both  {rr/J  and  {y/J  are  fully  observed.  The  results  in 
this  section  for  the  full  observation  case  are  presented  as  a  stepping  stone  to  the  more 
interesting  and  general  results  that  follow. 

From  (2.1),  with  simple  manipulation  then  squaring  both  sides  and  summing  over  k  we 
obtain, 

k  k 

=  ^(Xi  -  Ij_i)2.  (3.1) 

2=1 


Now  consider  the  matrices 

Jk 


From  (3.1),  we  see  that 


£****-!.  ok  =  Y,xh 

i=l  t=l 

k  k 

Hk  =  XM-l  alld  Mk  =  YjW<i- 

t=  1  i=  1 


BMk  =  Ok  +  Hk  —  2Jk 


(3.2) 


(3.3) 


Similarly,  from  (2.2)  with  simple  manipulation  then  squaring  both  sides  and  summing  over 
k  we  obtain, 

(3-4) 

1=1  1=1 


Now  consider  the  matrices 


k  k 

Tk  =  Qk  =  T,yl  and 

2=1  2=1 

2=1 


(3.5) 
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From  (3.4),  we  see  that 


DVk  =  Qk  +  Ok  -  2 Tfc. 


(3.6) 


3.1  Almost  Sure  Convergence 


Lemma  1  Consider  the  linear  system  (2.1), (2. 2)  denoted  A.  Suppose  {xk}  and  {y*,}  exist 
and  are  measured,  then 

lim  Bk,Dk=B,D  a.s.  (3.7) 

k— >oo 

where 


Bk 


£>k 


Ok  +  Hk  -  2  Jfc 
k 

Qk  +  Ok  -  2 Tk 

k 


and 


Proof:  Consider  estimate  Bk  first. 

Rewrite  the  estimate  as 

6  _BMk 


(3.8) 


By  the  strong  law  of  large  numbers  [11,  Page  85],  lim^oo  ^  Mk  =  1  a.s..  The  lemma 
results  follows  from  (3.8).  The  lemma  result  for  £>k  follows  similarly. 

□ 


3.2  Finite  Data  Sets 

Lemma  2  Consider  the  linear  system  (2.1),  (2. 2)  denoted  X.  Suppose  finite  date  sets 
(xo, •  •  • ,  xt}  o,nd  (yo> yi)  •  •  • ,  yr}  ore  measured,  then 

P(\Bt  -  B\  >  e)  <  ^5 

P(\Dt  -  D\  >  e)  <  (3.9) 


Proof:  Follows  from  a  similar  argument  as  Lemma  1  by  using  the  weak  law  of  large 
number[ll,  Page  86].  □ 


Remarks 

1.  Lemma  2  is  a  weaker  convergence  result  than  Lemma  1  in  the  sense  that  in  the  limit 
as  T  — )■  oo  a  convergence  in  probability  result  rather  than  an  almost  sure  convergence 
result  is  established. 
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4  Conditional  Mean  Estimates 

In  this  section  we  consider  parameter  estimation  based  on  conditional  mean  estimates 
in  lieu  of  the  true  states.  We  define  a  model  set,  A  :=  {X(B,D,xq)\B,D,xq  G  R},  of 
allowable  model  estimates  A*,  G  A  and  assume  the  true  model,  A,  lies  in  the  model  set  A. 
We  denote  the  history  or  sequence  of  estimates  as  A*  :=  {Ai, . . . ,  A*}.  Let  us  denote  the 
associated  conditional  mean  estimates  based  on  the  model  estimates,  Ajt,  as 

^k\k,Kk  =  Ok\k,Ak  =  E[Ok\yk,  A/c]i 

and  (4.1) 

Optimal  finite  dimensional  recursions  for  these  conditional  mean  estimates  can  be  found 
in  [13].  Hence,  we  propose  estimators  for  B  and  D  based  on  observations  {yk}  as: 


Bk\fu  =  Pr°3 


°k\kM  +  Hk\kJAk 


(Qk  +  Ok\k,Kk  2Bk\k,kk 


where 

•^fc+i  =  an(^ 

Rk+ 1  =  {Ai,...,Afc+i},  (4.4) 

and  proj(x)  =  max(6,x)  where  0  <  <5  <  B,D  is  a  small  constant,  and  ensures  Bk and 
Dk\^k  remain  positive. 

We  consider  first  the  situation  in  which  conditional  mean  estimates  based  on  the  true 
model  A  are  available. 

Lemma  3  Consider  the  linear  system  (2.1), (2. 2)  denoted  by  A.  Suppose  {yk}  exists  and 
is  measured.  Also  assume  that  conditional  mean  estimates  based  on  the  “correct  model” 
are  available.  Assume  the  true  model  A  satisfies  A  G  A.  Then 

Km  Bk\ktx,£>k\k,x  =  B,D  a.s.  (4.5) 


Proof:  The  proof  follows  Lemma  1.  We  consider  estimation  of  B  first.  We  rewrite  the 
estimate  (4.2)  without  the  projection  step  as 

f>  E[BMk\yk,X]  DE[Mk\yk,X] 

Bk\\  =  - £ - =  B - £ - •  (4-b) 

Hence,  the  estimate  is 

%  .  b(*M) 


-*(■ 


„„  '  Mk  ' 
BE  yk,  X 


=  BE\E\-fgkyk, x  yk, a  . 
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Hence, 

lim  Bk )A  =  B  a.s..  (4.7) 

Here,  we  have  used  that  E[AjAi]  =  .E[I?[X|A2]|Ai]  when  A\  C  A2  and  the  strong  law  of 
large  number  result  lim^oo  \E[Mk\Qk]  =  1  a.s..  Also,  because  (f>k  —  \E  [Mk\Qk,yk,  A]  is 
uniformly  integrable  (that  is  E[<t>k<f>'k\  is  bounded)  and  lim*^,*,  <f>k  =  1  a.s.  then  there  is 
convergence  in  conditional  mean: 

lim  E[4>k\yk,  A]  =  1  a.s.  . 


The  lemma  result  follows  for  Dk\\  similarly. 


□ 


Remarks 

1.  A  finite  date  set  result  which  extends  the  above  Lemma  can  be  established  using  the 
weak  law  of  large  numbers. 

The  following  theorem  holds. 


Theorem  1  Consider  the  linear  system  (2.1), (2. 2)  denoted  by  X.  Consider  a  sequence 
of  estimated  models  Kk  adaptively  updated  by  previous  parameter  estimates  as  shown  in 
(4-4)-  Then  the  recursion  converges  almost  surely  to  the  set  S  (or  possibly  the  projection 
boundaries  B  —  5  and  D  =  6),  where 


S  :=  local  argmini? 

B,D 


{[xk  ~  Zfc-i)2  ~  -B)  +  {{yk  ~  xkf  -dY  |A,34 


=  \2 


(4.8) 


where  X  =  X(B^D^Xq).  That  is,  S  contains  the  local  minimum  of  the  cost  function 
E  [((xfc  -  xk^f  -Bf  +  {(yk  -  xk)2  -  D )2  |  A,  yk] . 


Proof  First  we  consider  estimation  of  B  only.  Simple  manipulation  of  (4.2)  gives  a 
recursion  for  the  estimate  Bk^k  ^  as  follows: 

=  Bk- 1  +  —  ( AOk  +  A  Hk  —  2A  Jk  —  Bk-i)  (4.9) 

where  we  denote  Bk ,k  ^  by  the  shorthand  notation  Bk  and  define  AOk  :=  Ok^k  — 
Ok_i\k-i  j  and  similarly  define  AHk  and  AJk.  To  simplify  the  presentation  we  will 
ignore  the  projection  step.  Convergence  results  will  still  hold  if  projection  is  performed, 
see  Ljung  [14]. 

We  follow  the  technique  presented  in  [8].  This  proof  relies  on  the  ordinary  differential 
approach  described  in  [14,  15,  16].  In  the  ODE  approach,  convergence  of  a  difference 
equation  is  established  by  considering  the  convergence  properties  of  an  associated  ordinary 
differential  equation. 
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Convergence  of  (4.9)  can  be  analyzed  by  considering  the  following  ordinary  differential 
equation, 

=  i  f(B(r,k),k)  (4.10) 

where  k  is  here  a  fixed  parameter.  Also,  with  J3(r,  k)  abbreviated  as  B,  we  define  f(B,  k) 
as  follows: 

f(B,  k)  :=  E[AOklB  +  A Hk]B  -  2A Jk]B  -  B]  (4.11) 

Here  we  have  explicitly  shown  in  the  notation  used  that  A Ok  etc.  depend  on  the  parameter 
estimate  B. 

In  Ljung  [14],  it  is  stated  that  (4.9)  will  converge  to  the  set  S  or  a  boundary,  where 
S  :=  {B |  limjt-Hx,  f{B,  k)  =  0},  if  the  following  hold: 

1.  Certain  regularity  and  exponential  stability  conditions  hold  as  listed  in  [14]. 

2.  The  ODE  (4.10)  (which  is  parameterized  by  k )  is  asymptotically  stable  in  the  limit 
as  k  -4  oo. 


The  satisfy  condition  1  we  require  the  conditional  mean  filters  for  Jk,k  Ok[k  Hk\k 

and  ^k\k,kk  to  be  exponentially  forgetting  (which  has  not  been  shown  yet  but  which  we  will 
assume  to  be  the  case)  and  regularity  conditions  for  this  system  are  shown  and  discussed 
in  [8,  14]. 

To  establish  condition  2  we  use  a  Lyapunov  function  approach.  Consider  the  function, 


W(B,k)  =  -E 


((xfc  -  Xfc_i)2  -  B^j 


(4.12) 


It  follows  from  classical  expectation  results,  including  that  E[jE[X|Al2]|Ai]  =  jB[X|Ai] 
when  A\  C  A2,  that  W(B,k)  >  0.  Under  asymptotic  ergodicity  and  certain  smoothness 
conditions  the  differentiation  w.r.t.  B  and  the  expectation  operations  can  be  interchanged. 
Hence, 

dW(B{r,k),k) 


dB(r,k) 


=  -f(B(r,k),k) 


(4.13) 


and  it  then  follows  that 


dWjBjr ,  fc),fc) 
dr 


dW(B(T,fc),A:)dB(r,  fc) 
dB(r,k)  dr 

- f(B{r ,  fc),  k)^f(B{r,  k),  k). 


(4.14) 


It  follows  that  W(B,k)  is  a  Lyapunov  function. 

It  follows  from  Lynapov’s  direct  method  and  equation  (4.13)  that  B (r,  k)  converges  to  the 
set  {B|  lim^oo  f{B,  k)  =  0}  (or  possibly  to  the  boundary  B  =  <5  if  a  projection  step  is 
implemented).  Convergence  of  the  difference  equation  (4.9)  follows  from  Ljung[14]. 

We  note  that  under  asymptotic  ergodicity  (and  other  regularity  conditions  given  in  Ljung 
[14]  and  shown  in  [8])  and  using  the  results  in  Ljung  [14]  that  the  difference  equation 
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(4.9)  and  the  ODE  (4.10)  converge  to  the  same  set.  That  is,  in  the  limit  k  -4  oo 
convergence  to  the  set  {JB|  lim^oo  f(B,  k)  =  0}  is  equivalent  to  the  local  minimum  of 
E[{(xk-xk^)2-B)2\B,yk]. 

The  result  for  simultaneous  estimation  follows  using  a  similar  approach  by  writing  the 
estimate  Dk |fe  as  a  recursion  and  using  the  Lyapunov  function, 


W(B,D,h)  =  UE[((xk-xk^)2-B)2]+E[((yk-xk)2~Df}).  (4.15) 


Note  that  only  a  local  convergence  result  is  established  in  the  simultaneous  estimation 
case  because  the  set  S  may  contain  more  than  one  point. 

□ 


Remarks 

1.  This  establishes  local  convergence  to  the  true  model  because  it  follows  from  Lemma  3 
that  A  e  S. 

2.  Convergence  rates  have  not  been  established  but  can  be  shown  using  an  approach 
similar  to  [8]. 

3.  Local  convergence  results  for  estimation  from  finite  date  sets  can  be  established  in  the 
off-line  situation  when  the  recursions  (4.2)-(4.3)  are  passed  over  the  data  set  multiple 
times  and  the  model  estimate  is  updated  after  each  pass.  In  this  multi-pass  off¬ 
line  situation  the  estimators  are  an  example  of  the  EM  (expectation-maximization) 
algorithm  and  local  convergence  results  are  available  [17].  Convergence  results  in  the 
situation  of  a  single  pass  through  a  finite  data  set  which  are  analogous  to  Lemma  2 
have  not  yet  been  established. 


5  Implementation 

In  this  section  we  discuss  some  implementation  issues  relating  to  the  above  algorithm. 


Filters 

To  implement  the  estimator  defined  by  (4.2)-(4.4)  requires  the  quantities  Jk^k  Ok<k 
H k\k  and  Tfc!fc  ^  as  defined  in  (4.1)  to  be  calculated.  Optimal  finite  dimensional  recur¬ 
sions  for  Jklk  Kk,  dfc|fcAfe ,  Hk]kkk  and  ffe|fc  Afc  are  given  in  [13]. 


On-line  or  batch  processing 

It  is  possible  to  use  the  algorithm  presented  in  this  paper  in  both  an  on-line  and  batch 
manner.  The  recursion  (4.2)- (4.3)  can  be  iterated  as  each  new  data  point  arrives  to  produce 
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a  new  estimate.  Alternatively,  if  only  a  finite  data  set  is  available  then  the  algorithm  can 
be  iterated  through  the  data  set  multiple  times  with  the  parameter  estimates  improving 
on  each  successive  pass  through  the  data. 

In  a  multi-pass,  situation  convergence  results  follow  by  considering  the  data  set  as  the 
concatenation  of  copies  of  the  finite  data  set  (sometimes  know  as  the  periodic  extension  of 
a  finite  signal)  and  then  convergence  occurs  to  a  minimum  of  (4.15).  It  has  been  shown  that 
resolution  errors  result  from  working  on  a  finite  date  set[18].  These  resolution  errors  will 
be  apparent  on  this  concatenated  data  set  and  estimation  bias  will  result.  The  estimation 
error  will  consist  of  two  components  as  follows: 

( BklkM  -  B)2  =  Bl  +  Bv  (5.1) 

The  error  Bl  is  a  bias  due  to  using  a  finite  data  set  and  By  is  the  estimation  variance  of 
the  estimator.  The  estimation  variance  By  can  be  reduced  by  increasing  the  number  of 
passes  through  the  data  while  Bl  depends  only  on  the  finite  data  set. 


Initialization 

The  algorithm  requires  an  initial  guess  for  the  ARW  and  RRW  coefficients.  In  our  experi¬ 
ence  the  algorithm  will  converge  to  the  true  ARW  and  RRW  coefficients  from  a  reasonable 
initialization.  In  particular,  for  the  studies  presented  below,  convergence  to  the  true  val¬ 
ues  occurs  whenever  Bo  is  significantly  smaller  that  Do-  For  typical  applications  (IMU 
measurements)  this  requirement  is  not  restrictive. 

The  filters  from  [13]  should  be  allowed  some  data  to  initialize  and  we  suggest  that  these 
filters  be  iterated  for  at  least  500  data  points  before  parameter  estimation  is  started. 


Improved  Convergence  Rates 

The  rate  of  convergence  of  the  adaptive  algorithm  (4.2)-(4.4)  from  initial  parameter  es¬ 
timates  can  be  improved.  Using  the  difference  form  of  the  estimators,  ie.  (4.9)  etc., 
convergence  rates  can  be  improved  using  the  Polyak  technique  [19].  That  is, 

A+l|fc+l,Afc+1  =  A|fc,Afc  +  Tfc  {^°k\k,Ak  +  AHk\k,kk  ~  2A  Jk\k,Ak  ~  A|fc,Afc)  (5-2) 

where  jk  —  and  0  <  p  <  1  and  similarly  for  the  D  estimate.  The  estimates  are  then 
averaged.  The  convergence  results  still  hold. 

Other  techniques  such  as  forgetting  factors  can  also  be  useful[14]. 


Computational  effort 

The  computational  effort  required  to  implement  the  recursions  (4.2)- (4.4)  is  approximately 
100  floating  point  calculations  per  data  point  (and  does  not  depend^  on  the  length  of 
the  data).  This  includes  implementation  of  the  optimal  filters  for  Jk\k,Ak  etc-  Issues 
of  computational  effort  for  the  Allan  variance  technique  axe  not  discussed  in  the  given 
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references  and  will  depend  on  implementation  choices.  If  the  data  set  has  T  points,  then 
to  calculate  the  Allan  variance  of  size  N  requires  O  (T(l  +  3 /N))  (or  order (T(l  +  3/IV))) 
calculations.  Because  Allan  variance  is  plotted  on  a  log-log  graph,  the  Allan  variance 
needs  at  least  ATlog(T)  points  to  be  plotted  where  K  is  the  number  of  points  per  order 
of  magnitude.  The  computational  effort  is  hence  at  least  0(KT\og(T))  compared  to  the 
O(IOOT)  required  to  implement  (4.2)-(4.4). 

For  large  data  sets  the  estimators  presented  here  are  clearly  computationally  superior.  The 
exact  number  of  points  at  which  our  on-line  algorithm  is  computationally  superior  depends 
on  implementation  choices.  For  example,  if  the  Allan  variance  is  calculated  at  log  spaced 
points  and  K  =  10  starting  at  N  =  2  then  for  T  >  5000  (4.2)- (4.4)  is  computationally 
superior.  However,  if  calculated  at  log  spaced  points  and  K  —  5  then  the  Allan  variance 
technique  is  computationally  superior  until  T  >  4  x  106.  A  typical  data  set  might  contain 
24hrs  of  1Hz  measurements,  ie.  ~  107  data  points.  The  recursions  (4.2)- (4.4)  always  have 
the  advantage  that  the  data  does  not  need  to  be  stored. 


Numerical  Overflow 

If  implemented  in  an  on-line  manner  the  algorithm  will  need  to  be  modified  to  ensure 
overflow  of  Ok |fc  etc.  does  not  occur.  Scaling  or  forgetting  factors  could  be  introduced 
into  the  filters  of  [13].  Alternatively,  the  filters  could  be  reset  and  then  reinitialized  on 
data  to  avoid  numerical  overflow. 


6  Simulation  studies 


In  this  section  we  examine  the  proposed  technique  for  estimating  the  angular  random  walk 
and  rate  random  walk  coefficients.  First,  we  test  the  technique  on  computer  generated 
data  with  known  noise  characteristics.  Secondly,  we  test  the  technique  on  data  taken  from 
a  real  device,  namely  a  Honeywell  HG1700-AE11  IMU.  This  device,  contains  ring  laser 
gyros  and  produces  delta  angle  and  delta  velocity  signals  in  the  X,  Y  and  Z  directions. 
We  have  used  data  from  only  the  Y  direction  and  have  assumed  that  it  is  independent  of 
the  other  axes.  The  device  was  static. 


6.1  Computer  Generated  data 

These  computer  simulations  were  performed  using  the  MATLAB  package  and  matlab’s 
randn  normal  distributed  noise  sources. 

A  40000  point  sequence  was  generated  for  the  linear  system  (2.1), (2.2)  with  B  =  0.1 
and  D  —  1.  From  initial  estimates,  Bo  =  0.3  and  Do  =  1-5,  the  algorithm  (4.2)  was 
implemented  to  produce  new  estimates  of  the  B  and  D  coefficients.  The  final  estimates 
after  one  pass  through  the  data  were  B  =  0.1008  and  D  =  0.9991. 
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6.2  The  Honeywell  IMU  (HG1700-AE11) 

Navigational  data  arrives  from  the  Honeywell  IMU  at  600 Hz  and  100 Hz.  This  data  was 
averaged  to  1  Hz  before  any  estimation  was  performed  to  allow  recording  of  reasonable 
time  lengths  of  data.  Other  transformations  were  performed  on  the  data  to  scale  the 
output  but  these  are  not  important  in  this  discussion. 

The  recorded  data  were  analyzed  by  the  standard  Allan  variance  graph  techniques.  Figure 
5  shows  the  Allan  variance  graph  obtained  from  analysis  of  a  80000  data  point  set.  It 
appears  from  the  Allan  variance  plot  that  the  dominate  noise  sources  are  angular  random 
walk  and  rate  random  walk  which  justifies  our  noise  model  assumption  for  this  device. 

From  the  Figure  5  the  angular  random  walk  and  rate  random  walk  coefficients  can  be 
estimated,  using  the  technique  in  [4].  To  do  this,  lines  of  best  fit  are  drawn  on  the  parts 
of  log-log  plot  with  slope  —  l/2_and  +1/2.  Then  the  coefficients  can  be  calculated  as 
ARW  =  x4M2)  =  5  ±  1  and  RRW  =  v/^+M/v^  =  0.08(+0.02  -  0.04)  where  A-{N)  is 
the  value  of  the  —1/2  line  at  N  and  A+(N)  is  the  value  of  the  +1/2  line  at  sample  size 
N. 

The  technique  proposed  in  this  paper  is  an  alternative  method  for  calculating  these  two 
coefficients  without  producing  the  Allan  variance  graph. 

The  recursions  with  Polyak  step,  that  is  (5.2)  with  p  =  0.5  for  the  B  recursion  and  p  =  1 
for  the  D  recursion  and  the  model  updated  adaptively  according  to  (4.4),  were  used  to 
estimate  the  parameters  from  the  80000  point  set.  The  algorithm  was  initiated  with 
parameter  estimates,  Bo  =  0.1  and  Dq  =  20. 

The  parameter  estimates  after  one  pass  through  the  data  were  B  =  0.0045  and  D  =  31.09 
(or  equivalently  ARW  =  ^ft)  =  5.6  and  RRW  =  \/b  =  0.0668).  In  Figures  3  and  4  the 
evolution  of  parameter  estimates  versus  time  is  shown.  These  final  estimates  are  within 
the  error  bounds  of  the  estimates  from  the  Allan  variance  graph  as  shown  in  Table  1. 


Table  1  (U):  Comparison  of  Estimated  Parameters 


Allan  Variance 

On-line  Recursions 

ARW 

5  ±  1 

5.6 

RRW 

0.08(+0.02  -  0.04) 

0.067 

Various  other  initializations  (Bo  and  Do)  were  tried  including:  Bo  =  0.1  and  Do  =  1; 
Bq  =  0.3  and  Do  =  1.5;  amongst  others.  Convergence  occurs  when  Bo  <  A).  Note  that 
both  the  Allan  variance  graph  method  and  the  method  proposed  here  require  large  data 
sets  to  enable  estimation  of  B.  Estimation  of  D  can  be  done  on  smaller  data  sets. 


6.3  Discussion 
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Good  choices  for  p  in  the  step  modified  estimation  algorithm  will  depend  of  the  relative 
power  of  the  w  and  v  noises.  The  Honeywell  IMU’s  major  source  of  noise  is  angular 
random  walk.  Increasing  the  step  size  for  the  B  recursion  allows  the  recursion  longer  time 
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to  forget  initial  estimates  (relative  to  the  D  recursion)  and  this  facilitates  convergence 
from  poor  initializations. 


7  Conclusions 

This  paper  presented  a  new  technique  for  estimating  the  noise  model  of  IMU  gyros.  The 
existing  techniques  for  analysis  of  IMU  gyros  require  large  amounts  of  data  to  be  stored 
and  to  be  processed  off-line  to  produce  an  Allan  variance  graph.  The  technique  proposed 
here  can  analyze  data  on-line  as  it  arrives  from  the  gyro  and  can  produce  immediately 
estimates  of  the  noise  model  in  the  sense  that  it  does  not  require  the  user  to  manually 
analyze  an  Allan  variance  graph. 
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Figure  5  (U):  Allan  variance(standard  derivation)  versus  sample 
size  with  error  bars  and  lines  of  worst  fit. 
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